Learning a Robust Consensus Matrix for Clustering Ensemble via Kullback-Leibler Divergence Minimization

Authors

  • Peng Zhou
  • Liang Du
  • Hanmo Wang
  • Lei Shi
  • Yi-Dong Shen
Abstract

Clustering ensemble has emerged as an important extension of the classical clustering problem. It provides a framework for combining multiple base clusterings of a data set to generate a final consensus result. Most existing clustering ensemble methods simply combine clustering results without taking noise into account, which may degrade the clustering performance. In this paper, we propose a novel robust clustering ensemble method. To improve robustness, we capture the sparse and symmetric errors and integrate them into our robust consensus framework to learn a low-rank matrix. Since the objective function is difficult to optimize directly, we develop a block coordinate descent algorithm that is theoretically guaranteed to converge. Experimental results on real-world data sets demonstrate the effectiveness of our method.
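The core idea above — learning a clean consensus matrix from noisy co-association matrices while absorbing the noise into sparse error terms — can be sketched as follows. This is a minimal illustration under our own simplifying assumptions, not the paper's algorithm: it uses a least-squares surrogate with soft-thresholding in place of the paper's KL-divergence objective, and the function names are ours.

```python
import numpy as np

def coassociation(labels):
    """Co-association matrix of one base clustering:
    entry (i, j) is 1 if points i and j share a cluster, else 0."""
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)

def robust_consensus(base_clusterings, lam=0.5, n_iter=50):
    """Toy block-coordinate-descent sketch: alternately update a
    consensus matrix S and per-clustering sparse errors E_m so that
    A_m ≈ S + E_m, with soft-thresholding enforcing sparsity on E_m.
    (Least-squares surrogate; the paper minimizes a KL divergence.)"""
    A = [coassociation(c) for c in base_clusterings]
    M = len(A)
    E = [np.zeros_like(A[0]) for _ in range(M)]
    for _ in range(n_iter):
        # S-step: closed-form average of the denoised matrices
        S = sum(A[m] - E[m] for m in range(M)) / M
        S = np.clip(S, 0.0, 1.0)
        # E-step: soft-threshold the residuals (sparsity via L1 penalty)
        for m in range(M):
            R = A[m] - S
            E[m] = np.sign(R) * np.maximum(np.abs(R) - lam, 0.0)
    return S
```

With two clean base clusterings and one noisy one, the learned consensus still keeps agreeing pairs close to 1 and disagreeing pairs close to 0; the noisy clustering's disagreements are partly absorbed into its error matrix.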


Similar Resources

An Abstract Weighting Framework for Clustering Algorithms

Recent work in unsupervised learning has emphasized a new trend in algorithmic design: influencing the clustering via weights on the instance points. In this paper, we handle clustering as a constrained minimization of a Bregman divergence. Theoretical results show benefits resembling those of boosting algorithms, and bring new modified weighted versions of cl...
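As a concrete reminder of the object being minimized above, a Bregman divergence D_phi(x, y) = phi(x) − phi(y) − ⟨∇phi(y), x − y⟩ can be computed directly; the choice of phi and the vectors below are illustrative examples of ours, not from the paper.

```python
import numpy as np

def bregman_divergence(phi, grad_phi, x, y):
    """D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return phi(x) - phi(y) - np.dot(grad_phi(y), x - y)

# With phi(v) = ||v||^2, the Bregman divergence reduces to the
# squared Euclidean distance ||x - y||^2.
phi = lambda v: float(np.dot(v, v))
grad_phi = lambda v: 2.0 * v
x, y = np.array([1.0, 2.0]), np.array([3.0, 1.0])
d = bregman_divergence(phi, grad_phi, x, y)  # = ||x - y||^2 = 5.0
```

Other choices of phi recover other familiar divergences, e.g. negative entropy yields the KL divergence on probability vectors.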


A Robust Symmetric Nonnegative Matrix Factorization Framework for Clustering Multiple Heterogeneous Microbiome Data

Integration of multi-view datasets comprised of heterogeneous sources or different representations is challenging, as it requires understanding the subtle and complex relationships in the data. Such data integration methods attempt to efficiently combine the complementary information of multiple data types to construct a comprehensive view of the underlying data. Nonnegative matrix factorization (NMF), an app...


Exploiting the Limits of Structure Learning via Inherent Symmetry

This theoretical paper is concerned with the structure learning limit for Gaussian Markov random fields from i.i.d. samples. The common strategy is to apply the Fano method to a family of restricted ensembles. The efficiency of this method, however, depends crucially on the selected restricted ensembles. To break through this limitation, we analyze the whole graph ensemble from a group-theoretical ...


An Introduction to Inference and Learning in Bayesian Networks

Bayesian networks (BNs) are modern tools for modeling phenomena in dynamic and static systems and are used in different subjects such as disease diagnosis, weather forecasting, decision making and clustering. A BN is a graphical-probabilistic model which represents causal relations among random variables and consists of a directed acyclic graph and a set of conditional probabilities. Structure...


Model Confidence Set Based on Kullback-Leibler Divergence Distance

Consider the problem of estimating the true density h(·) based upon a random sample X1, …, Xn. In general, h(·) is approximated using an appropriate (in some sense; see below) model fθ(x). This article uses Vuong's (1989) test along with a collection of k (> 2) non-nested models to construct a set of appropriate models, a so-called model confidence set, for the unknown model h(·). Application of such confide...
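Since the KL divergence distance is the criterion here (and appears in the main paper's title), a minimal discrete-case computation may help; the densities below are made-up illustrations, not data from either paper.

```python
import numpy as np

def kl_divergence(p, q):
    """KL(p || q) = sum_i p_i * log(p_i / q_i) for discrete densities;
    terms with p_i == 0 contribute nothing by convention."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# A candidate model closer to h(.) in KL distance is a better
# candidate for the model confidence set.
h  = np.array([0.5, 0.3, 0.2])     # stand-in for the true density
f1 = np.array([0.45, 0.35, 0.20])  # candidate model 1 (close to h)
f2 = np.array([0.20, 0.30, 0.50])  # candidate model 2 (far from h)
```

Here kl_divergence(h, f1) is much smaller than kl_divergence(h, f2), so f1 would survive the selection.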




Journal:

Volume   Issue

Pages  -

Publication date: 2015